25 research outputs found

    Capturing the ‘ome’: the expanding molecular toolbox for RNA and DNA library construction

    All sequencing experiments and most functional genomics screens rely on the generation of libraries to comprehensively capture pools of targeted sequences. In the past decade especially, driven by progress in massively parallel sequencing, numerous studies have comprehensively assessed the impact of particular manipulations on library complexity and quality, and characterized the activities and specificities of several key enzymes used in library construction. Fortunately, careful protocol design and reagent choice can substantially mitigate many of these biases, and enable reliable representation of sequences in libraries. This review aims to guide the reader through the vast expanse of literature on the subject to promote informed library generation, independent of the application.

    Genome dynamics of the human embryonic kidney 293 lineage in response to cell biology manipulations

    The HEK293 human cell lineage is widely used in cell biology and biotechnology. Here we use whole-genome resequencing of six 293 cell lines to study the dynamics of this aneuploid genome in response to the manipulations used to generate common 293 cell derivatives, such as transformation and stable clone generation (293T); suspension growth adaptation (293S); and cytotoxic lectin selection (293SG). Remarkably, we observe that copy number alteration detection could identify the genomic region that enabled cell survival under selective conditions (in this case, ricin selection). Furthermore, we present methods to detect human/vector genome breakpoints and a user-friendly visualization tool for the 293 genome data. We also establish that the genome structure composition is in steady state for most of these cell lines when standard cell culturing conditions are used. This resource enables novel and more informed studies with 293 cells, and we will distribute the sequenced cell lines to this effect.

    Screening proteomes for secretable fragments in yeast

    No full text

    S. cerevisiae fragment datasets

    No full text
    Results data of two-round SECRiFY screens of human (HEK293) cDNA fragments in S. cerevisiae (3 replicate screens). Sequencing data was mapped on the human GRCh38 transcriptome assembled using known transcripts from protein-coding genes only.
    "Sc_resultstable_all.txt" = all in-frame fragments detected in either the unsorted baseline library (merged for 3 replicates), or in all 3 sorted replicate samples.
    "Sc_resultstable_enriched.txt" = those with log_FC > 1 in all three replicates (11625)
    "Sc_resultstable_depleted.txt" = those with log_FC < -1 in all three replicates (136531)

    For each fragment, the following information is provided:

    # Ensembl_geneID --> Ensembl gene ID
    # Ensembl_txID --> Ensembl transcript ID
    # tx_start --> Transcript start position on human genome GRCh38
    # tx_end --> Transcript end position on human genome GRCh38
    # chr --> chromosome #
    # gene_symbol --> official gene symbol
    # frag_start --> fragment start position on the transcript, 0-based
    # frag_stop --> fragment end position on the transcript, 0-based
    # cDNA --> DNA sequence of the fragment
    # protein --> translated AA sequence of the fragment in frame 1
    # IND_count --> raw count value in the baseline (unsorted) library (merged for 3 replicates), NAs replaced by 0.001
    # SORT(1)_count --> raw count value in the sorted sample replicate (1), NAs replaced by 0.001
    # IND_FPTM --> normalized FPTM value in the baseline (unsorted) library
    # SORT(1)_FPTM --> normalized FPTM value in the sorted sample replicate (1)
    # logFC_(1) --> log2(SORT(1)_FPTM/IND_FPTM)
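The logFC definition and the enriched/depleted thresholds described above can be illustrated with a minimal Python sketch. It is not taken from the deposited R/bash scripts; the function names `log_fc` and `classify` and the example FPTM values are hypothetical, and only the formula log2(SORT_FPTM/IND_FPTM) and the cut-offs of 1 and -1 in all three replicates come from the description.

```python
import math

def log_fc(sort_fptm, ind_fptm):
    """log2 ratio of the sorted-sample FPTM over the baseline (unsorted) FPTM."""
    return math.log2(sort_fptm / ind_fptm)

def classify(log_fcs, threshold=1.0):
    """Label a fragment from its per-replicate logFC values.

    'enriched' requires logFC > threshold in every replicate,
    'depleted' requires logFC < -threshold in every replicate.
    """
    if all(v > threshold for v in log_fcs):
        return "enriched"
    if all(v < -threshold for v in log_fcs):
        return "depleted"
    return "neither"

# Hypothetical FPTM values for one fragment: one baseline, three sorted replicates.
ind_fptm = 2.0
sort_fptms = [8.0, 6.0, 5.0]
label = classify([log_fc(s, ind_fptm) for s in sort_fptms])
```

Requiring the threshold in every replicate, rather than on the mean, means a single discordant replicate moves the fragment to neither list.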

    Domain hits Pfam

    No full text
    Domain hits of human representative fragments identified in S. cerevisiae or P. pastoris SECRiFY screening. 'Common' indicates domains found in both enriched and depleted fragments, 'Unique' domains are found in enriched or depleted fragments exclusively.

    SECRiFY R/bash code

    No full text
    R/bash scripts used for sequencing data processing and analysis of SECRiFY fragment sequences

    P. pastoris fragment datasets

    No full text
    Results data of two-round SECRiFY screens of human (HEK293) cDNA fragments in P. pastoris (3 replicate screens). Sequencing data was mapped on the human GRCh38 transcriptome assembled using known transcripts from protein-coding genes only.
    "Pp_resultstable_all.txt" = all in-frame fragments detected in either the unsorted baseline library (merged for 3 replicates), or in all 3 sorted replicate samples.
    "Pp_resultstable_enriched.txt" = those with log_FC > 1 in all three replicates (10404)
    "Pp_resultstable_depleted.txt" = those with log_FC < -1 in all three replicates (141357)

    For each fragment, the following information is provided:

    # Ensembl_geneID --> Ensembl gene ID
    # Ensembl_txID --> Ensembl transcript ID
    # tx_start --> Transcript start position on human genome GRCh38
    # tx_end --> Transcript end position on human genome GRCh38
    # chr --> chromosome #
    # gene_symbol --> official gene symbol
    # frag_start --> fragment start position on the transcript, 0-based
    # frag_stop --> fragment end position on the transcript, 0-based
    # cDNA --> DNA sequence of the fragment
    # protein --> translated AA sequence of the fragment in frame 1
    # IND_count --> raw count value in the baseline (unsorted) library (merged for 3 replicates), NAs replaced by 0.001
    # SORT(1)_count --> raw count value in the sorted sample replicate (1), NAs replaced by 0.001
    # IND_FPTM --> normalized FPTM value in the baseline (unsorted) library
    # SORT(1)_FPTM --> normalized FPTM value in the sorted sample replicate (1)
    # logFC_(1) --> log2(SORT(1)_FPTM/IND_FPTM)
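The `protein` field is described as the frame-1 translation of the `cDNA` field. A minimal Python sketch of such a translation is given below, assuming the standard genetic code. The function name `translate_frame1` is hypothetical, and the handling of stop codons (`*`), ambiguous codons (`X`) and trailing incomplete codons (dropped) is an assumption, since the description does not state how the deposited tables treat them.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (last base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate_frame1(cdna):
    """Translate a DNA sequence starting at its first base (frame 1).

    Assumptions: stop codons are written as '*', codons containing
    non-ACGT characters become 'X', and a trailing partial codon is dropped.
    """
    seq = cdna.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )

# Hypothetical fragment sequence, not taken from the dataset.
peptide = translate_frame1("ATGGCCTAA")
```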

    Human transcriptome GRCh38 known protein-coding

    No full text
    Fasta file used for mapping of sequenced SECRiFY fragments